AMENDMENTS AND LISTING OF CLAIMS

Please amend the following claims:

1. (AMENDED) A method for generating training data that can be used with
statistical models to normalize abbreviations in text, including: 

providing a corpus of text including expansions of the abbreviations to be normalized;
identifying processing the corpus of text to identify the expansions in the corpus of
text;

generating processing the corpus of text to generate context information describing
the context of the text in which the expansions were identified; and
storing training data as a function of the generated context information. 

2. (Original) The method of claim 1 wherein: 

generating context information includes generating local context information; and 
storing training data includes storing local context data as training data. 

3. (Original) The method of claim 2 wherein the local context information and 
local context training data includes sentence level information. 

4. (Original) The method of claim 3 wherein the sentence level information 
includes words in a sentence in which the identified expansion is located. 

5. (Original) The method of claim 1 wherein: 

generating context information includes generating discourse context information; 
and 

storing training data includes storing discourse context data as training data. 

6. (Original) The method of claim 5 wherein the discourse context information
and discourse context training data include text section level information.

7. (Original) The method of claim 1 wherein:

generating context information includes generating local context information and
discourse context information; and
storing training data includes storing local context data and discourse data as training
data.

8. (Original) The method of claim 1 wherein storing the training data includes 
storing a set of feature vectors, each feature vector including the context information 
generated for the associated expansion identified in the corpus of text.

9. (Original) The method of claim 8 and further including processing text using a 
Maximum Entropy model and the stored feature vectors to normalize abbreviations in the
text. 

10. (Original) The method of claim 8 wherein each feature vector further includes 
the abbreviation and associated expansion. 

11. (Original) The method of claim 1 and further including processing text using a
statistical model and the stored training data to normalize abbreviations in the text. 

12. (Original) The method of claim 1 wherein: 

the method further includes providing stored abbreviation data representative of 
abbreviations and associated expansions for which training data is to be 
generated; and 

identifying the expansions includes processing the corpus of text as a function of the
stored abbreviation data.

13. (Original) A method for electronically generating feature vectors that can be
used in connection with electronic data processing systems implementing statistical models
to normalize abbreviations in text, including:

providing a database of abbreviation data representative of abbreviations and
associated expansions to be normalized;
providing a database having a corpus of text including expansions of the
abbreviations to be normalized;
processing the corpus of text as a function of the abbreviation data to identify the
expansions in the corpus of text;
generating context information describing the context of the text in which the
expansions were identified; and
storing a set of feature vectors, each feature vector including the context information
generated for the associated expansion identified in the corpus of text.

14. (Original) The method of claim 13 wherein the feature vectors include local
level context information and discourse level context information.

15. (Original) The method of claim 13 and further including operating an
electronic data processing system implementing a statistical model and the stored set of
feature vectors to normalize abbreviations in the text.
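
For illustration only, the steps recited in claims 1, 8, 10, 13 and 14 might be sketched as follows. This sketch is not part of the claims and does not limit them. The abbreviation data, the sample corpus, the `FeatureVector` fields and the function name `generate_training_data` are all hypothetical, and simple sentence splitting with exact phrase matching is assumed in place of whatever identification technique an actual embodiment would use.

```python
import re
from dataclasses import dataclass

# Hypothetical stored abbreviation data: abbreviation -> associated expansions.
ABBREVIATIONS = {
    "RA": ["rheumatoid arthritis", "right atrium"],
}


@dataclass
class FeatureVector:
    abbreviation: str
    expansion: str
    local_context: list     # sentence level: other words in the sentence
    discourse_context: str  # text section level: heading of the section


def generate_training_data(sections, abbreviations):
    """Scan a corpus given as (heading, body) pairs for known expansions
    and return one feature vector per identified expansion."""
    vectors = []
    for heading, body in sections:
        # Naive sentence splitting on terminal punctuation (an assumption).
        for sentence in re.split(r"(?<=[.!?])\s+", body):
            lowered = sentence.lower()
            for abbr, expansions in abbreviations.items():
                for expansion in expansions:
                    pattern = r"\b" + re.escape(expansion) + r"\b"
                    if not re.search(pattern, lowered):
                        continue
                    words = re.findall(r"[a-z0-9']+", lowered)
                    expansion_words = set(expansion.split())
                    # Local context: sentence words other than the expansion.
                    local = [w for w in words if w not in expansion_words]
                    vectors.append(
                        FeatureVector(abbr, expansion, local, heading)
                    )
    return vectors


# Hypothetical two-section corpus.
corpus = [
    ("Cardiac findings",
     "The right atrium is mildly enlarged. No effusion is seen."),
    ("Joints", "She has a history of rheumatoid arthritis."),
]
training_data = generate_training_data(corpus, ABBREVIATIONS)
```

In this sketch each stored vector pairs an identified expansion with its sentence-level words and its section heading, so that a statistical model trained on the vectors could later choose among candidate expansions of an abbreviation such as "RA" based on similar context.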